faster numeric and POSIXct methods for IDate and ITime #1393

Merged
mattdowle merged 6 commits into Rdatatable:master from idatetime_methods on Oct 13, 2017

jangorecki (Member) commented on Oct 13, 2015

Closes #1392.
This PR is not ready yet because the consistency of timezone behaviour has not been tested. I've kept the timezone handling consistent with the current ITime behaviour, but this limits the speed-up to POSIXct input in the UTC timezone.
More tests are welcome, ideally following daylight saving time and time zone best practices.
The current benchmark is below.

  • the *.default functions (as.IDate.default, as.ITime.default) are the original methods
  • the generics (as.IDate, as.ITime) dispatch to my new methods
library(data.table)
library(microbenchmarkCore) # install.packages("microbenchmarkCore", repos="https://olafmersmann.github.io/drat")
as.ITime.default = data.table:::as.ITime.default
as.IDate.default = data.table:::as.IDate.default
intpx = function(x) as.integer(as.POSIXct(x, origin = "1970-01-01", tz = "UTC"))
set.seed(1)
ii = sample(seq(intpx("2012-10-12"), intpx("2015-10-12")), 1e7, TRUE)
nn = as.numeric(ii)
ct = as.POSIXct(nn, origin = "1970-01-01", tz = "UTC")

microbenchmark(times = 10L,
               as.IDate.default(ii, origin="1970-01-01"),
               as.IDate.default(nn, origin="1970-01-01"),
               as.IDate.default(ct),
               as.IDate(ii),
               as.IDate(nn),
               as.IDate(ct))
# Unit: milliseconds
#                                         expr      min       lq     mean   median       uq      max neval
#  as.IDate.default(ii, origin = "1970-01-01") 122.3949 123.5557 136.8920 125.4423 150.1868 173.9983    10
#  as.IDate.default(nn, origin = "1970-01-01") 116.5835 117.9173 123.4366 118.3104 121.4779 144.2163    10
#                         as.IDate.default(ct) 197.6881 198.8312 208.2653 199.2498 226.9576 232.7463    10
#                                 as.IDate(ii) 101.8510 102.8230 113.2321 103.5843 127.8585 131.9020    10
#                                 as.IDate(nn) 146.1235 146.7726 150.7616 147.1008 148.0633 179.3617    10
#                                 as.IDate(ct) 152.6400 154.0221 157.7458 154.7143 158.7362 178.6318    10
microbenchmark(times = 10L,
               as.ITime.default(ii, origin = "1970-01-01"),
               as.ITime.default(nn, origin = "1970-01-01"),
               as.ITime.default(ct),
               as.ITime(ii),
               as.ITime(nn),
               as.ITime(ct))
# Unit: milliseconds
#                                         expr       min        lq      mean    median        uq       max neval
#  as.ITime.default(ii, origin = "1970-01-01") 1888.5241 1897.3788 1904.8904 1906.3941 1914.1901 1923.2183    10
#  as.ITime.default(nn, origin = "1970-01-01") 1886.1342 1888.4144 1901.7510 1898.2280 1911.8567 1931.2231    10
#                         as.ITime.default(ct)  883.3834  885.5955  899.8955  892.9906  915.1113  929.7641    10
#                                 as.ITime(ii)  276.6699  277.0064  283.8398  278.3184  287.2178  318.3477    10
#                                 as.ITime(nn)  180.4939  180.9974  183.0361  182.1190  183.8354  191.7105    10
#                                 as.ITime(ct)  189.3085  191.3544  207.5544  196.2207  202.4649  287.6540    10
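
For context on why a UTC-only fast path is attractive: for UTC input, the date and time-of-day parts follow from plain integer arithmetic on seconds since the epoch, with no time zone lookup. The sketch below shows that idea in base R only. The helper names `secs_to_days` and `secs_to_tod` are hypothetical and are not data.table internals; this PR's actual methods may be written differently.

```r
# Hypothetical helpers, not data.table internals: for UTC epoch seconds,
# the date and time-of-day parts need only integer division and modulo.
secs_to_days = function(secs) as.integer(secs %/% 86400L)  # days since 1970-01-01
secs_to_tod  = function(secs) as.integer(secs %% 86400L)   # seconds since midnight

x = as.POSIXct("2015-10-12 13:45:10", tz = "UTC")
secs = as.integer(unclass(x))

# Compare against the slower route through base R date-time conversion.
lt = as.POSIXlt(x, tz = "UTC")
stopifnot(
  secs_to_days(secs) == as.integer(as.Date(x, tz = "UTC")),
  secs_to_tod(secs)  == lt$hour * 3600L + lt$min * 60L + as.integer(lt$sec)
)
```

For any other timezone the offset from UTC varies with the date (daylight saving time), so the same shortcut does not apply directly.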

jangorecki added the Low label on Feb 25, 2016
mattdowle added this to the v1.10.6 milestone on Aug 8, 2017
mattdowle modified the milestones: Candidate, v1.10.6 on Aug 8, 2017
mattdowle (Member) commented on Aug 8, 2017

This one is strange. It seems to merge with master OK, but it fails its tests with:

 Error in if (attr(x, "tzone") %in% c("UTC", "GMT")) as.ITime(unclass(x),  : 
    argument is of length zero
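
R raises this error when the condition passed to `if` has length zero: `attr(x, "tzone")` is `NULL` for a POSIXct that carries no tzone attribute, and `NULL %in% ...` returns `logical(0)`. A minimal sketch of one possible guard follows. `is_utc` is a hypothetical helper used only for illustration, not the fix that was applied locally.

```r
# Hypothetical helper: TRUE only when a tzone attribute is present and is UTC or GMT.
is_utc = function(x) {
  tz = attr(x, "tzone")
  # isTRUE() turns logical(0) (from a NULL tzone) into FALSE
  isTRUE(tz[1L] %in% c("UTC", "GMT"))
}

no_tz = structure(0, class = c("POSIXct", "POSIXt"))  # POSIXct with no tzone attribute
utc   = as.POSIXct("2015-01-16", tz = "UTC")

# The unguarded condition fails on the object without tzone.
failed = tryCatch({
  if (attr(no_tz, "tzone") %in% c("UTC", "GMT")) "utc" else "other"
  FALSE
}, error = function(e) TRUE)

stopifnot(failed, !is_utc(no_tz), is_utc(utc))
```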

When I fix that locally to catch a missing tzone attribute, it still fails 3 of its tests with a date offset issue:

> test(1776.6, as.IDate.default(p), as.IDate(p))
Running test id 1776.6     Test 1776.6 ran without errors but failed check that x equals y:
> x = as.IDate.default(p) 
First 6 of 1000000 :[1] "2015-01-16" "2015-02-24" "2015-05-09" "2015-09-08" "2014-12-24" "2015-09-04"
> y = as.IDate(p) 
First 6 of 1000000 :[1] "2015-01-17" "2015-02-25" "2015-05-09" "2015-09-08" "2014-12-24" "2015-09-05"
Mean relative difference: 6.046997e-05
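
A one-day shift on only some elements is the pattern that appears when the same instants are converted to dates under two different time zones. The base-R sketch below illustrates that general mechanism only. It is not a confirmed diagnosis of test 1776.6, and the timestamps and time zone are made up for the example.

```r
# Made-up timestamps in a non-UTC zone; only the late-evening one crosses midnight in UTC.
p = as.POSIXct(c("2015-01-16 23:30:00", "2015-05-09 10:00:00"), tz = "America/New_York")

d_local = as.Date(p, tz = "America/New_York")  # calendar date in the object's own zone
d_utc   = as.Date(p, tz = "UTC")               # calendar date of the same instants in UTC

stopifnot(
  identical(format(d_local), c("2015-01-16", "2015-05-09")),
  identical(format(d_utc),   c("2015-01-17", "2015-05-09"))
)
```

If this is the cause, both code paths would need to agree on which time zone defines the calendar date before the tests can compare them.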

mattdowle modified the milestones: Candidate, v1.10.6 on Oct 13, 2017
mattdowle merged commit 1404654 into Rdatatable:master on Oct 13, 2017
mattdowle deleted the idatetime_methods branch on October 13, 2017 21:51